A Kernel Method for the Two-Sample-Problem
Authors
Abstract
We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS). We present two tests based on large deviation bounds for the test statistic, while a third is based on the asymptotic distribution of this statistic. The test statistic can be computed in quadratic time, although efficient linear time approximations are available. Several classical metrics on distributions are recovered when the function space used to compute the difference in expectations is allowed to be more general (e.g., a Banach space). We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
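As an illustration of the quadratic-time statistic described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian kernel with a hand-picked bandwidth `sigma`, and the function names (`gaussian_kernel`, `mmd2_biased`, `mmd2_unbiased`) are hypothetical. It estimates the squared statistic from kernel evaluations only; the test thresholds (large deviation bounds, asymptotic distribution) are not reproduced here.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b.

    The kernel choice and the bandwidth sigma are assumptions made for
    this sketch; the abstract does not fix either.
    """
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd2_biased(x, y, sigma=1.0):
    """Biased quadratic-time estimate of the squared statistic.

    Equals the squared RKHS distance between the two empirical mean
    embeddings, so it is non-negative up to round-off.
    """
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()


def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased variant: drops the diagonal terms k(x_i, x_i), k(y_j, y_j).

    This estimate can be slightly negative when the samples come from
    the same distribution.
    """
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    same_a = rng.normal(size=(200, 2))
    same_b = rng.normal(size=(200, 2))
    shifted = rng.normal(loc=2.0, size=(200, 2))
    print("same distribution:   ", mmd2_unbiased(same_a, same_b))
    print("shifted distribution:", mmd2_unbiased(same_a, shifted))
```

Both estimators build full kernel matrices, which is where the quadratic cost comes from. In practice the estimate would be compared against a threshold to decide whether to reject the hypothesis that the two samples share a distribution.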
Similar resources
THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)
Small Area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. Small area estimation is needed in obtaining information on a small area, such as sub-district or village. Generally, in some cases, small area estimation uses parametric modeling. But in fact, a lot of models have no linear relationship between the small area average and the covariat...
An infeasible interior-point method for the $P_*$-matrix linear complementarity problem based on a trigonometric kernel function with full-Newton step
An infeasible interior-point algorithm for solving the $P_*$-matrix linear complementarity problem based on a kernel function with trigonometric barrier term is analyzed. Each (main) iteration of the algorithm consists of a feasibility step and several centrality steps, whose feasibility step is induced by a trigonometric kernel function. The complexity result coincides with the best result for infea...
An Effective Numerical Technique for Solving Second Order Linear Two-Point Boundary Value Problems with Deviating Argument
Based on reproducing kernel theory, an effective numerical technique is proposed for solving second order linear two-point boundary value problems with deviating argument. In this method, reproducing kernels with Chebyshev polynomial form are used (C-RKM). The convergence and an error estimation of the method are discussed. The efficiency and the accuracy of the method are demonstrated on some n...
Reproducing Kernel Hilbert Space (RKHS) method for solving singular perturbed initial value problem
In this paper, a numerical scheme for solving singular initial/boundary value problems is presented. By applying the reproducing kernel Hilbert space method (RKHSM) to these problems, an approximate solution is obtained. Numerical examples are given to demonstrate the accuracy of the present method. The result obtained by the method and the exact solution are found to be in good agr...
A hybrid method for singularly perturbed delay boundary value problems exhibiting a right boundary layer
The aim of this paper is to present a numerical method for singularly perturbed convection-diffusion problems with a delay. The method is a combination of the asymptotic expansion technique and the reproducing kernel method (RKM). First an asymptotic expansion for the solution of the given singularly perturbed delayed boundary value problem is constructed. Then the reduced regular delayed diffe...
The combined reproducing kernel method and Taylor series for solving nonlinear Volterra-Fredholm integro-differential equations
In this letter, the numerical scheme of nonlinear Volterra-Fredholm integro-differential equations is proposed in a reproducing kernel Hilbert space (RKHS). The method is constructed based on the reproducing kernel properties in which the initial condition of the problem is satisfied. The nonlinear terms are replaced by their Taylor series. In this technique, the nonlinear Volterra-Fredholm integro...
Journal title:
Volume, issue:
Pages: -
Publication date: 2006